There is tremendous potential in AI, but:
Read more: https://www.statnews.com/
From Recital 71 EU GDPR
“(71) The data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or similarly significantly affects him or her, such as automatic refusal of an online credit application or e-recruiting practices without any human intervention.
Such processing includes ‘profiling’ that consists of any form of automated processing of personal data evaluating the personal aspects relating to a natural person, in particular to analyse or predict aspects concerning the data subject’s performance at work, economic situation, health, personal preferences or interests, reliability or behaviour, location or movements, where it produces legal effects concerning him or her or similarly significantly affects him or her.
However, decision-making based on such processing, including profiling, should be allowed where expressly authorised by Union or Member State law to which the controller is subject, including for fraud and tax-evasion monitoring and prevention purposes conducted in accordance with the regulations, standards and recommendations of Union institutions or national oversight bodies and to ensure the security and reliability of a service provided by the controller, or necessary for the entering or performance of a contract between the data subject and a controller, or when the data subject has given his or her explicit consent.
In any case, such processing should be subject to suitable safeguards, which should include specific information to the data subject and the right to obtain human intervention, to express his or her point of view, to obtain an explanation of the decision reached after such assessment and to challenge the decision.”
When we think about the interpretability of models, we usually distinguish three classes of methods:
Students A, B and C carry out a project together. With this payoff table, determine what portion of the award each student should get.
\[ \phi_j = \frac{1}{|P|!} \sum_{\pi \in \Pi} (v(S_j^\pi \cup \{j\}) - v(S_j^\pi)) \]
where \(\Pi\) is the set of all possible permutations of the players in \(P\), and \(S_j^\pi\) is the set of players that precede player \(j\) in permutation \(\pi\).
\[ \hat\phi_j = \frac{1}{|B|} \sum_{\pi \in B} (v(S_j^\pi \cup \{j\}) - v(S_j^\pi)) \]
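The Monte Carlo estimator above can be sketched in a few lines of Python. This is an illustrative implementation, not part of any package; the value function \(v\) is assumed to be a callable that maps a set of players to a number.

```python
import random

def shapley_mc(v, players, n_samples=1000, seed=0):
    """Monte Carlo estimate of Shapley values.

    v       -- value function: maps a frozenset of players to a number
    players -- list of player identifiers
    """
    rng = random.Random(seed)
    phi = {p: 0.0 for p in players}
    for _ in range(n_samples):
        perm = players[:]
        rng.shuffle(perm)          # a random permutation pi from B
        before = frozenset()       # S_j^pi: players before p in pi
        for p in perm:
            # marginal contribution of p given the players before it
            phi[p] += v(before | {p}) - v(before)
            before = before | {p}
    return {p: phi[p] / n_samples for p in phi}
```

For an additive game (each player contributes a fixed amount regardless of the coalition), the estimator recovers the individual contributions exactly, because every marginal contribution equals the player's own weight.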
Let’s start with local explanations, focused on a single point \(x\) and the model prediction \(f(x)\).
Now instead of players, you can think about variables. We will distribute a reward between variables to recognize their contribution to the model prediction \(f(x)\).
age, which means conditioning the data on the condition age = 8, class = 1st. In the next step, we add fare to the coalition, and so on.
Adding the class variable to a coalition with the age variable increases the reward by \(0.086\).
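Applied to a model, the same permutation-sampling idea looks roughly as follows. In this sketch an “absent” variable is filled in from a background observation, which is a simplification of the conditioning described above; all names and interfaces are illustrative assumptions.

```python
import random

def shap_attributions(f, x, background, n_perm=200, seed=0):
    """Permutation-based Shapley attributions for one prediction.

    f          -- model: maps a feature dict to a number
    x          -- the observation to explain (dict feature -> value)
    background -- list of feature dicts used to fill 'absent' features
    """
    rng = random.Random(seed)
    feats = list(x)
    phi = {j: 0.0 for j in feats}
    for _ in range(n_perm):
        order = feats[:]
        rng.shuffle(order)
        z = dict(rng.choice(background))  # start from a background row
        prev = f(z)
        for j in order:
            z[j] = x[j]                   # switch feature j to its value in x
            cur = f(z)
            phi[j] += cur - prev          # marginal contribution of j
            prev = cur
    return {j: phi[j] / n_perm for j in phi}
```

By construction, the attributions sum to the difference between \(f(x)\) and the prediction for the background observation, which is exactly the decomposition SHAP aims at.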
Desired characteristics of explanations (from LIME paper)
The core ideas behind LIME are:
The explanation will be a model \(g\) that approximates the behavior of the complex model \(f\) and is as simple as possible
\[ \hat g = \arg \min_{g \in G} L(f, g, \pi(x)) + \Omega(g) \]
where
Explanations can be calculated with the following procedure:
sample_around(x’)
similarity(x’, z’[i])
K-LASSO(y’, x’, w’)

where

- similarity – a distance function in the original data space
- K-LASSO – a weighted LASSO linear-regression model that selects K variables
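A minimal sketch of these three steps for binary interpretable inputs. For brevity it substitutes a weighted ridge regression for K-LASSO, and every function name here is illustrative, not the API of the lime package.

```python
import numpy as np

def lime_explain(f, x, n_samples=500, kernel_width=0.75, seed=0):
    """LIME-style local surrogate (simplified sketch).

    f -- black-box model, maps binary interpretable vectors (0/1) to a number
    x -- interpretable representation of the explained instance
    """
    rng = np.random.default_rng(seed)
    d = len(x)
    # 1. sample_around(x'): random binary vectors around the instance
    Z = rng.integers(0, 2, size=(n_samples, d))
    Z[0] = x                                      # keep the instance itself
    y = np.array([f(z) for z in Z])
    # 2. similarity(x', z'): exponential kernel on normalized Hamming distance
    dist = np.abs(Z - x).sum(axis=1) / d
    w = np.exp(-(dist ** 2) / kernel_width ** 2)
    # 3. weighted ridge regression in place of K-LASSO
    Zb = np.hstack([Z, np.ones((n_samples, 1))])  # add intercept column
    W = np.diag(w)
    A = Zb.T @ W @ Zb + 1e-6 * np.eye(d + 1)      # tiny ridge for stability
    beta = np.linalg.solve(A, Zb.T @ W @ y)
    return beta[:-1]                              # coefficients = explanation
```

When \(f\) is itself linear in the binary features, the surrogate recovers its coefficients almost exactly, which is a useful sanity check for any LIME-style implementation.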
Let’s see how LIME can be used to solve this problem.
Initial settings
Interpretable data space
Sampling around x
Fitting of an interpretable model
How to transform the input data into a binary vector of shorter length?
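One common answer for tabular data, sketched below (an illustrative choice, not the only one): discretize each numeric variable into bins and set bit \(j\) to 1 when the perturbed value falls into the same bin as the explained observation.

```python
import bisect

def to_interpretable(x, z, bins):
    """Binary interpretable representation for tabular data (illustrative).

    x, z  -- dicts feature -> value (explained instance and perturbed sample)
    bins  -- dict feature -> sorted list of bin edges
    Bit j is 1 iff z falls into the same bin as x for feature j.
    """
    def bin_of(j, value):
        return bisect.bisect_right(bins[j], value)
    return [int(bin_of(j, z[j]) == bin_of(j, x[j])) for j in x]
```

The resulting vector is short (one bit per feature) and human-readable: each bit answers “is this sample like \(x\) on this variable?”.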
The LIME method was designed to explain the model’s behavior locally, around the observation of interest. But we are often interested in knowing or at least getting an intuition about how the model works globally.
Assuming the user has time to look at LIME explanations for B observations, the question is how to select them.
Submodular pick (SP) algorithm
The LIME paper presents a user-study example where the submodular picks method most effectively convinces the user how the model works.
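A greedy sketch of the submodular pick idea. The coverage objective follows the spirit of the LIME paper, with global feature importance \(I_j = \sqrt{\sum_i |W_{ij}|}\); the concrete function names are assumptions for illustration.

```python
import math

def submodular_pick(W, budget):
    """Greedy SP-LIME sketch: pick `budget` rows of the explanation
    matrix W (observations x features) maximizing feature coverage."""
    n, d = len(W), len(W[0])
    # global importance of each feature across all explanations
    I = [math.sqrt(sum(abs(W[i][j]) for i in range(n))) for j in range(d)]
    def coverage(rows):
        # total importance of features touched by at least one chosen row
        return sum(I[j] for j in range(d)
                   if any(W[i][j] != 0 for i in rows))
    picked = []
    for _ in range(budget):
        best = max((i for i in range(n) if i not in picked),
                   key=lambda i: coverage(picked + [i]))
        picked.append(best)
    return picked
```

Because coverage is submodular, this greedy choice comes with the usual \((1 - 1/e)\) approximation guarantee for the budgeted selection.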
Ceteris paribus is a Latin phrase, meaning “all other things being equal” or “all else unchanged”, see Wikipedia.
It is a function defined for model \(f\), observation \(x\), and variable \(j\) as:
\[\begin{equation} h^{f}_{x,j}(z) = f\left(x_{j|=z}\right), \end{equation}\]
where \(x_{j|=z}\) stands for observation \(x\) with \(j\)-th coordinate replaced by value \(z\).
The Ceteris Paribus profile is a function that describes how the model response would change if the \(j\)-th variable were changed to \(z\) while the values of all other variables were kept fixed at those specified by \(x\).
In the implementation we cannot check all possible values of \(z\); we have to select a meaningful subset of them.
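Once a grid of candidate values of \(z\) is chosen, a CP profile is straightforward to compute (illustrative sketch; the model is assumed to accept a feature dict):

```python
def ceteris_paribus(f, x, j, grid):
    """CP profile: model response as feature j varies, all else fixed at x."""
    profile = []
    for z in grid:
        x_z = dict(x)
        x_z[j] = z                 # x with j-th coordinate replaced by z
        profile.append((z, f(x_z)))
    return profile
```

Plotting the returned \((z, f(x_{j|=z}))\) pairs gives the familiar CP/ICE curve for observation \(x\).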
Note that CP profiles are also commonly referred to as Individual Conditional Expectation (ICE) profiles.
\[ g^{PD}_{j}(z) = E_{X_{-j}} f(X_{j|=z}) . \]
\[ \hat g^{PD}_{j}(z) = \frac{1}{n} \sum_{i=1}^{n} f(x^i_{j|=z}). \]
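The estimator above is simply the average of CP profiles over the dataset, e.g. (illustrative sketch, same assumed model interface as before):

```python
def partial_dependence(f, data, j, grid):
    """PD profile: Monte Carlo estimate of E_{X_{-j}} f(X_{j|=z}),
    i.e. CP profiles averaged over the observations in `data`."""
    profile = []
    for z in grid:
        preds = []
        for x in data:
            x_z = dict(x)
            x_z[j] = z             # replace j-th coordinate in every row
            preds.append(f(x_z))
        profile.append((z, sum(preds) / len(preds)))
    return profile
```

This pointwise averaging is why PD plots can hide heterogeneous effects that individual CP/ICE curves would reveal.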
Pros
Cons
Today we have discussed the three fundamental methods of explaining the behavior of predictive models.
SHAP is a method that allows us to explain the behavior of the model by decomposing the difference between a particular prediction and the average prediction of the model.
LIME is a method that allows us to explain the behavior of the model by approximating it with a linear surrogate model.
CP and PD are methods that allow us to explain the behavior of the model by tracing the model response along changes in a single variable.
All these methods can be used to explain the behavior of the model globally, but also to compare models with different structures.
The choice of the method depends on the problem we are dealing with, the structure of the model, and the preferences of the user.